Why is building a heap \(O(n)\) and not \(O(n \log n)\)? The work is not distributed evenly.
- Most nodes in a complete binary tree are leaves at the bottom. The `bubbleDown` operation on a leaf does zero work.
- About half the nodes (\(\approx n/2\)) are leaves.
- About a quarter of the nodes (\(\approx n/4\)) are one level up. They do at most 1 swap.
- The amount of work done per node increases as we get closer to the root, but the number of nodes at those levels decreases exponentially.
- The total work is a sum that mathematically converges to be proportional to \(n\), not \(n \log n\).
Formal Analysis
The total work is the sum of the work done at each height \(h\): \[\sum_{h=0}^{\log n} \frac{n}{2^{h+1}} \cdot O(h) \approx n \sum_{h=0}^{\log n} \frac{h}{2^{h+1}} \approx O(n)\]
Initial state